Abstract

We consider $N \times N$ symmetric or hermitian random matrices with independent, identically distributed
entries where the probability distribution for each matrix element is given by a measure $\nu$ with a
subexponential decay. We prove that the local eigenvalue statistics in the bulk of the spectrum for these matrices
coincide with those of the Gaussian Orthogonal Ensemble (GOE) and the Gaussian Unitary Ensemble
(GUE), respectively, in the limit $N \to \infty$. Our approach is based on the study of the Dyson Brownian
motion via a related new dynamics, the local relaxation flow. We also show that the Wigner semicircle
law holds locally on the smallest possible scales and we prove that eigenvectors are fully delocalized and
eigenvalues repel each other on arbitrarily small scales.
1 Introduction

A central question concerning random matrices is the universality conjecture which states that local statistics 
of eigenvalues of large N x N square matrices H are determined by the symmetry type of the ensembles but 
are otherwise independent of the details of the distributions. 

There are two types of universalities: the edge universality and the bulk universality concerning the 
interior of the spectrum. The edge universality is commonly approached via the fairly robust moment method 
[33, 34]; very recently an alternative proof was given [36]. The bulk universality is a subtler problem. In 
the hermitian case, it states that the local $k$-point correlation functions of the eigenvalues, after appropriate
rescaling, are given by the determinant of the sine kernel

$$ \det\big(K(x_i - x_j)\big)_{i,j=1}^{k}, \qquad K(x) = \frac{\sin \pi x}{\pi x}, \tag{1.1} $$

independently of the distribution of the entries. A similar statement holds for symmetric matrices, but the
explicit formulae are somewhat more complicated.
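As a small numerical illustration of the determinantal structure in (1.1), the following sketch evaluates $\det(K(x_i - x_j))$ for a handful of points. The function names are hypothetical and chosen only for this example; the determinant is computed by a naive cofactor expansion, which is adequate only for small $k$.

```python
import math


def sine_kernel(x: float) -> float:
    """K(x) = sin(pi x) / (pi x), extended by continuity to K(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def sine_correlation(points):
    """Rescaled k-point correlation function det(K(x_i - x_j)) at the given points."""
    return det([[sine_kernel(xi - xj) for xj in points] for xi in points])
```

For two points the determinant reduces to $1 - K(x_1 - x_2)^2$, which vanishes as the points coincide; this is the simplest manifestation of the repulsion between eigenvalues.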
For ensembles that remain invariant under the transformations $H \to U^* H U$ for any unitary matrix $U$,
the joint probability density function of all the $N$ eigenvalues can be explicitly computed. These ensembles
are typically given by the probability density

$$ P(H)\,\mathrm{d}H \sim \exp\big(-N\,\mathrm{Tr}\,V(H)\big)\,\mathrm{d}H, $$

where $V$ is a real function with sufficient growth at infinity and $\mathrm{d}H$ is the flat measure. The eigenvalues
are strongly correlated and they are distributed according to a Gibbs measure with a long range logarithmic
interaction potential. The joint probability density of the eigenvalues of $H$ can be computed explicitly:

$$ f(\lambda_1, \lambda_2, \dots, \lambda_N) = \mathrm{const.} \prod_{i<j} (\lambda_i - \lambda_j)^{\beta} \prod_{j=1}^{N} e^{-N V(\lambda_j)}, \tag{1.2} $$

where $\beta = 1$ for symmetric and $\beta = 2$ for hermitian ensembles. The local statistics can be obtained via a
detailed analysis of orthogonal polynomials on the real line with respect to the weight function $\exp(-V(x))$.
Quadratic $V$ corresponds to the Gaussian ensembles. This approach was originally applied [26] for ensembles
that lead to classical orthogonal polynomials (e.g. GUE leads to Hermite polynomials). Later a general
method using orthogonal polynomials was developed to tackle a very general class of unitary ensembles
(see, e.g. [5, 9, 10, 26, 29] and references therein).

Many natural matrix ensembles are typically not unitarily invariant; the most prominent example is 
the Wigner matrices. These are symmetric or hermitian matrices whose entries above the diagonal are 
independent, identically distributed random variables. The only unitarily invariant Wigner ensembles are 
the Gaussian ensembles. For general Wigner matrices, no explicit formula is available for the joint eigenvalue 
distribution. Thus the basic algebraic connection between eigenvalue ensembles and orthogonal polynomials 
is missing and completely new methods had to be developed.

The bulk universality for hermitian Wigner ensembles has been established jointly with S. Peche, J.
Ramirez, B. Schlein and H.T. Yau, and independently by Tao-Vu [20, 35, 21]. These works rely on
Wigner matrices with a Gaussian divisible distribution, i.e. ensembles of the form

$$ H + sV, \tag{1.3} $$

where $H$ is a Wigner matrix, $V$ is an independent standard GUE matrix and $s$ is a positive constant.
Johansson [24] (see also [4]) proved bulk universality for the eigenvalues of such matrices using an explicit
formula by Brezin-Hikami [6, 24] on the correlation functions. Unfortunately, the analogous formula for
symmetric matrices is not very explicit and the technique of [20, 24] cannot be extended to prove universality
for symmetric Wigner matrices.

A key observation of Dyson is that if the parameter $s$ in the matrix $H + sV$ is varied and is interpreted as
time, then the evolution of the eigenvalues is given by a coupled system of stochastic differential equations,
commonly called the Dyson Brownian motion (DBM) [12]. If we replace the Brownian motions by
Ornstein-Uhlenbeck processes to keep the variance constant, then the resulting dynamics on the eigenvalues,
which we still call DBM, has the GUE eigenvalue distribution as the invariant measure. Thus the result
of Johansson can be interpreted as stating that the local statistics of GUE are reached via DBM for time of
order one. In fact, by analyzing the dynamics of DBM with ideas from the hydrodynamical limit, we have
extended Johansson's result to $s^2 \gg N^{-3/4}$ [19]. The key observation of [19] is that the local statistics of
eigenvalues depend exclusively on the approach to local equilibrium, which in general is faster than reaching
global equilibrium. Unfortunately, the identification of local equilibria still uses explicit representations of
correlation functions by orthogonal polynomials (following e.g. [29]), and the extension to other ensembles
is not a simple task.

Therefore, the universality for symmetric random matrices remained open and the only partial result is 
given by Tao-Vu (Theorem 23 in [35]) for Wigner matrices with the first four moments of the matrix elements 
matching those of GOE. 

In [18], together with B. Schlein and H.T. Yau, we have introduced a general approach based on a
new stochastic flow, the local relaxation flow, which locally behaves like DBM, but has a faster decay to
equilibrium. This approach completely circumvents explicit formulae. It is thus applicable to prove the
universality for a very broad class of matrices that includes hermitian, symmetric and symplectic Wigner
matrices and in principle it works also for Wishart matrices and general $\beta$-ensembles. The heart of the proof
is a convex analysis and the model specific information involves only estimates on the accuracy of the local
density of states. For simplicity of the presentation, we will focus on the hermitian and symmetric cases; the
necessary modifications for the other cases are technical.

We present results only about the convergence of the local correlation functions; this implies, among
other things, that the distribution of the gap (difference between neighboring eigenvalues) is universal as well
(Wigner surmise). In particular, short gaps are suppressed, i.e. the eigenvalues tend to repel each other.
This feature is characteristic of the strongly correlated point process of eigenvalues of random matrices, in
contrast to the Poisson process of independent points.

Universality of local eigenvalue statistics is believed to hold for a much broader class of matrix ensembles
than we have introduced. Wigner originally invented random matrices to mimic the eigenvalues of the
then unknown Hamiltonian of heavy nuclei; lacking any information, he assumed that the matrix elements
are i.i.d. random variables subject to the hermitian condition. Conceivably, the matrix elements need not
be fully independent or identically distributed for universality. There is little known about matrices with
correlated entries, apart from the unitary invariant ensembles that represent a very specific correlation. In
case of a certain class of Wigner matrices with weakly correlated entries, the semicircle law and its Gaussian
fluctuation have been proven [31, 32].

Much more studied are various classes of random matrices with independent but not identically
distributed entries. The most prominent example is the Anderson model [2], i.e. a Schrödinger operator on a
regular square lattice with a random potential. Restricted to a finite box, it can be represented by a matrix
whose diagonal elements are i.i.d. random variables; the deterministic off-diagonal elements are given by the
Laplacian. In space dimensions three or higher and for weak randomness, the Anderson model is conjectured
to exhibit a metal-insulator transition. Near the spectral edges, the eigenfunctions are localized [22, 1] and
the local eigenvalue statistics are Poissonian [28]; in particular there is no level repulsion. It is conjectured,
but not yet proven, that in the middle of the spectrum the eigenfunctions are extended (some results on
the quantum diffusion and delocalization of eigenfunctions are available in a certain scaling limit [14, 7]).
Furthermore, in the delocalization regime the local eigenvalue statistics are expected to be given by GUE or
GOE statistics, depending on whether the time reversal symmetry is broken by a magnetic field or not. Based
upon this conjecture, local eigenvalue statistics are used to compute the phase diagram numerically. It is very
remarkable that the random Schrödinger operator, represented by a very sparse random matrix, exhibits the
same universality class as the full Wigner matrix, at least in a certain energy range.

An intermediate class of ensembles between these two extremes is the family of random band matrices.
These are hermitian or symmetric random matrices $H$ with independent but not identically distributed
entries. The variance of $H_{ij}$ depends only on $|i - j|$ and it becomes negligible if $|i - j|$ exceeds a given
parameter $W$, the band-width; for example, $\mathbb{E}|H_{ij}|^2 \sim \exp(-|i - j|/W)$. It is conjectured that for narrow
bands, $W \ll \sqrt{N}$, the local eigenvalue statistics are Poisson, while for broad bands, $W \gg \sqrt{N}$, they are given by
GUE or GOE, depending on the symmetry class. (Localization properties of $H$ for $W \ll N^{1/8}$ have been
recently shown [30] but not local statistics.) To mimic the three dimensional Anderson model, the rows and
columns of $H$ may be labelled by a finite domain of the three dimensional lattice, $i, j \in \Lambda \subset \mathbb{Z}^3$. The only
rigorous result for this three dimensional band matrix concerns the density of states by establishing that
the Wigner semicircle law holds as $W \to \infty$ [11].

Finally, we mention that universality of local eigenvalue statistics is often investigated by supersymmetric
techniques in the physics literature. These methods are extremely powerful for extracting the results by saddle
point computations, but the analysis justifying the saddle point approximation usually lacks mathematical
rigor. It is a challenge to the mathematical physics community to put the supersymmetric method on a solid
mathematical basis; so far only the density of states has been investigated rigorously by using this technique
[11].



2 Local semicircle law, delocalization and level repulsion 

Each approach that proves bulk universality for general Wigner matrices first requires an analysis of the local
density of eigenvalues. The Wigner semicircle law [38] (and its analogue for Wishart matrices, the
Marchenko-Pastur law [25]) has traditionally been among the first results established on random matrices. Typically,
however, the empirical density is shown to converge weakly on macroscopic scales, i.e. on intervals that
contain $O(N)$ eigenvalues. Based upon our results [15, 16, 17], here we show that the semicircle law holds
on much smaller scales as well.

To fix the notation, we assume that in the symmetric case the matrix elements of $H$ are given by

$$ h_{\ell k} = N^{-1/2} x_{\ell k}, \tag{2.1} $$

where $x_{\ell k}$ for $\ell < k$ are independent, identically distributed random variables with the distribution $\nu$ that
has zero expectation and variance 1. The diagonal elements $x_{\ell\ell}$ are also i.i.d. with distribution $\tilde\nu$ that has
zero expectation and variance two. In the hermitian case we assume that

$$ h_{\ell k} = N^{-1/2}(x_{\ell k} + i\, y_{\ell k}), \tag{2.2} $$

where $x_{\ell k}$ and $y_{\ell k}$ are real i.i.d. random variables with zero expectation and variance $\frac{1}{2}$. The diagonal
elements are also centered and have variance one. The eigenvalues of $H$ will be denoted by
$\lambda_1 \le \lambda_2 \le \dots \le \lambda_N$. The Gaussian ensembles (GUE and GOE) are special Wigner ensembles with Gaussian single-site
distribution.
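To make the normalization concrete, here is a minimal sketch of sampling a symmetric Wigner matrix with the variances described above, using only the Python standard library. The function names and the default Gaussian sampler are assumptions made for this illustration; any single-site distribution with zero mean and unit variance could be passed in as `draw`.

```python
import math
import random


def sample_symmetric_wigner(n, draw=None, seed=0):
    """Return an n x n symmetric matrix (list of lists) with
    off-diagonal entries N^{-1/2} x, Var x = 1, and
    diagonal entries N^{-1/2} x, Var x = 2."""
    rng = random.Random(seed)
    if draw is None:
        # Default: standard Gaussian entries, which gives a GOE matrix.
        draw = lambda: rng.gauss(0.0, 1.0)
    scale = 1.0 / math.sqrt(n)
    h = [[0.0] * n for _ in range(n)]
    for l in range(n):
        h[l][l] = scale * math.sqrt(2.0) * draw()
        for k in range(l + 1, n):
            h[l][k] = h[k][l] = scale * draw()
    return h


def normalized_trace_of_square(h):
    """(1/N) Tr H^2, i.e. the second moment of the empirical eigenvalue density."""
    n = len(h)
    return sum(v * v for row in h for v in row) / n
```

With this scaling the quantity $\frac{1}{N}\mathrm{Tr}\, H^2$ has expectation $1 + 1/N$, so it stays of order one as $N$ grows, in line with a spectrum that remains bounded.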

We will often need to assume that the distributions $\nu$ and $\tilde\nu$ have Gaussian decay, i.e. there exists $\delta_0 > 0$
such that

$$ \int_{\mathbb{R}} \exp\big[\delta_0 x^2\big]\, d\nu(x) < \infty, \qquad \int_{\mathbb{R}} \exp\big[\delta_0 x^2\big]\, d\tilde\nu(x) < \infty. \tag{2.3} $$

In several statements we can relax this condition to assuming only subexponential decay, i.e. that there
exist $\delta_0 > 0$ and $\gamma > 0$ such that

$$ \int_{\mathbb{R}} \exp\big[\delta_0 |x|^{\gamma}\big]\, d\nu(x) < \infty, \qquad \int_{\mathbb{R}} \exp\big[\delta_0 |x|^{\gamma}\big]\, d\tilde\nu(x) < \infty. \tag{2.4} $$

The matrix elements thus have variance of order $1/N$. This normalization guarantees that the spectrum
remains bounded as $N \to \infty$; in fact the spectrum converges to $[-2, 2]$ almost surely. Therefore the typical
spacing between neighboring eigenvalues is of order $1/N$.
For any $I \subset \mathbb{R}$ let $\mathcal{N}_I$ denote the number of eigenvalues in $I$. Wigner's theorem states that for any
fixed interval $I$

$$ \frac{\mathcal{N}_I}{N} \to \int_I \varrho_{sc}(x)\, dx $$

almost surely as $N \to \infty$, where

$$ \varrho_{sc}(x) = \frac{1}{2\pi} \sqrt{(4 - x^2)_+} $$

is the density of the semicircle law. This result can be interpreted as a law of large numbers for the empirical
eigenvalue density on macroscopic scales, i.e. for intervals that contain $O(N)$ eigenvalues. The following
result shows that the semicircle law holds on intervals $I$ of length $|I| = \eta \ge K/N$ for sufficiently large $K$.

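Theorem 2.1 [17, Theorem 3.1] Suppose that (2.3) holds. Let $\kappa > 0$ and fix an energy $E \in [-2 + \kappa, 2 - \kappa]$.
Consider the interval $I = [E - \eta/2, E + \eta/2]$ of length $\eta$ about $E$. Then there exist positive constants $C, c$,
depending only on $\kappa$, and a universal constant $c_1$ such that for any $\delta \le c_1 \kappa$ there is $K = K_\delta$ such that

$$ \mathbb{P}\left\{ \left| \frac{\mathcal{N}_I}{N\eta} - \varrho_{sc}(E) \right| \ge \delta \right\} \le C e^{-c \delta^2 \sqrt{N\eta}} \tag{2.5} $$

holds for all $\eta$ satisfying $K/N \le \eta \le 1/K$.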



In particular, this result shows that $\mathcal{N}_I / N\eta$ converges to $\varrho_{sc}(E)$ in probability as long as $\eta = \eta(N)$ is such
that $\eta(N) \to 0$ and $N\eta(N) \to \infty$. The Gaussian decay condition (2.3) can be relaxed to (2.4) if $\eta \ge N^{-1+\varepsilon}$
with any $\varepsilon > 0$ at the expense of a weaker bound on the right hand side of (2.5), see Section 5 of [20]. The
estimate also deteriorates if the energy is close to the edge, see Proposition 4.1 of [19] for a more precise
statement. Based upon our proofs, similar estimates were given in [35, Theorem 56] for energies in the bulk
and somewhat stronger bounds in [36, Theorem 1.7] for the edge.
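The deterministic quantity that $\mathcal{N}_I$ is compared with can be computed in closed form. The sketch below evaluates the semicircle density, its distribution function, and the predicted number of eigenvalues $N \int_I \varrho_{sc}$ in a window of length $\eta$ about $E$; the function names are hypothetical and used only for this illustration.

```python
import math


def rho_sc(x):
    """Semicircle density (1 / 2 pi) * sqrt((4 - x^2)_+)."""
    return math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)


def n_sc(e):
    """Distribution function of the semicircle law (integral of rho_sc up to e)."""
    if e <= -2.0:
        return 0.0
    if e >= 2.0:
        return 1.0
    return (0.5
            + e * math.sqrt(4.0 - e * e) / (4.0 * math.pi)
            + math.asin(e / 2.0) / math.pi)


def predicted_count(n, e, eta):
    """Semicircle prediction for the number of eigenvalues in [e - eta/2, e + eta/2]."""
    return n * (n_sc(e + eta / 2.0) - n_sc(e - eta / 2.0))
```

For small $\eta$ in the bulk the prediction is close to $N \eta\, \varrho_{sc}(E)$, which is the normalization appearing in (2.5).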

Sketch of the proof. For any $z = E + i\eta$, $\eta > 0$, let

$$ m(z) = m_N(z) = \frac{1}{N}\, \mathrm{Tr}\, \frac{1}{H - z} \tag{2.6} $$

be the Stieltjes transform of the empirical density of states and let

$$ m_{sc}(z) = \int_{\mathbb{R}} \frac{\varrho_{sc}(x)}{x - z}\, dx $$

be the Stieltjes transform of the semicircle law. Clearly $\varrho_\eta(E) = \frac{1}{\pi} \mathrm{Im}\, m(z)$ gives the normalized density of
states of $H$ around $E$ regularized on a scale $\eta$. Therefore it is sufficient to establish the convergence of $m(z)$
to $m_{sc}(z)$ for small $\eta = \mathrm{Im}\, z$.

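The first step of the proof is to provide an upper bound on $\mathcal{N}_I$. Let $B^{(k)}$ denote the $(N - 1) \times (N - 1)$
minor of $H$ after removing the $k$-th row and $k$-th column. Let $\lambda_\alpha^{(k)}$, $\alpha = 1, 2, \dots, N - 1$, denote the eigenvalues
of $B^{(k)}$ and $u_\alpha^{(k)}$ denote its eigenvectors. Computing the $(k, k)$ diagonal element of the resolvent $(H - z)^{-1}$
we easily obtain the following expression for $m(z)$:

$$ m(z) = \frac{1}{N} \sum_{k=1}^{N} \frac{1}{H - z}(k, k) = \frac{1}{N} \sum_{k=1}^{N} \left[ h_{kk} - z - \frac{1}{N} \sum_{\alpha=1}^{N-1} \frac{\xi_\alpha^{(k)}}{\lambda_\alpha^{(k)} - z} \right]^{-1}, \tag{2.7} $$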
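where

$$ \xi_\alpha^{(k)} := N\, |a^{(k)} \cdot u_\alpha^{(k)}|^2, \tag{2.8} $$

and $a^{(k)}$ is the $k$-th column of $H$ without the diagonal element $h_{kk}$. Taking the imaginary part, and using
$\mathcal{N}_I \le C N \eta\, \mathrm{Im}\, m(z)$, we have

$$ \mathcal{N}_I \le C N \eta^2 \sum_{k=1}^{N} \left| \sum_{\alpha :\, \lambda_\alpha^{(k)} \in I} \xi_\alpha^{(k)} \right|^{-1}. \tag{2.9} $$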



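It is an elementary fact that the eigenvalues of $H$ and $B^{(k)}$, for each fixed $k$, are interlaced, i.e. the number
of $\lambda_\alpha^{(k)}$ in $I$ is at least $\mathcal{N}_I - 1$. For each fixed $k$ the random variables $\{\xi_\alpha^{(k)} : \alpha = 1, 2, \dots, N - 1\}$ are almost
independent and have expectation value one, thus the probability of the event

$$ \Omega_k := \left\{ \sum_{\alpha :\, \lambda_\alpha^{(k)} \in I} \xi_\alpha^{(k)} \le \delta (\mathcal{N}_I - 1) \right\} $$

is negligible for small $\delta$ [17, Lemma 4.7]. On the complement of all $\Omega_k$ we thus have from (2.9) that

$$ \mathcal{N}_I \le \frac{C N^2 \eta^2}{\delta (\mathcal{N}_I - 1)}, $$

from which it follows that $\mathcal{N}_I \le C N \eta$ with very high probability.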

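The second step of the proof is to establish that $m(z)$ and $m_{sc}(z)$ are close. Let $m^{(k)}(z)$ denote the
Stieltjes transform of the empirical distribution of the eigenvalues $\lambda_\alpha^{(k)}$ of $B^{(k)}$. Then it follows from (2.7)
that

$$ m(z) = \frac{1}{N} \sum_{k=1}^{N} \frac{1}{h_{kk} - z - \left(1 - \frac{1}{N}\right) m^{(k)}(z) - X_k} \tag{2.10} $$

holds, where

$$ X_k = \frac{1}{N} \sum_{\alpha=1}^{N-1} \frac{\xi_\alpha^{(k)} - 1}{\lambda_\alpha^{(k)} - z}. $$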

Fixing the matrix $B^{(k)}$, we view $X_k$ as a random variable of the independent $a^{(k)}$ vector alone. Using
again that the numerators $\xi_\alpha^{(k)} - 1$ are almost independent and have zero expectation, we obtain that $X_k$ is
bounded by $(N\eta)^{-1/2}$ with high probability [17, Lemma 6.1]. The interlacing property guarantees that $m(z)$
and $m^{(k)}(z)$ are close. Since $h_{kk}$ is also small, we obtain from (2.10) that

$$ m(z) = -\frac{1}{N} \sum_{k=1}^{N} \frac{1}{m(z) + z + \varepsilon_k}, \tag{2.11} $$

where $\varepsilon_k$ are small with very high probability. Note that the Stieltjes transform of the semicircle law is the
solution of the equation

$$ m_{sc}(z) = -\frac{1}{m_{sc}(z) + z} \tag{2.12} $$

that is stable away from the spectral edges, $z = \pm 2$. Comparing the solution of (2.11) and (2.12) we obtain
that $|m - m_{sc}|$ is small. Strictly speaking, this argument applies only for $\eta \ge (\log N)^4/N$ since the smallness
of each $\varepsilon_k$ is guaranteed only apart from a set of probability $e^{-c\sqrt{N\eta}}$ [17, Lemma 4.2] and there are $N$ possible
values of $k$. On very short scales, our proof uses an additional expansion of the denominators in (2.11) up to
second order and we use that the expectation of $X_k$, the main contribution to $\varepsilon_k$, vanishes [17, Section 6]. $\Box$
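The self-consistent equation for the Stieltjes transform of the semicircle law, $m_{sc}(z) = -1/(m_{sc}(z) + z)$, is a quadratic equation and can be solved explicitly. The following sketch (with a hypothetical function name) selects the root with positive imaginary part, which is the one corresponding to a Stieltjes transform in the upper half plane, and recovers the regularized density $\frac{1}{\pi} \mathrm{Im}\, m_{sc}(E + i\eta)$.

```python
import cmath
import math


def m_sc(z: complex) -> complex:
    """Stieltjes transform of the semicircle law for Im z > 0.

    Solves m^2 + z m + 1 = 0 and returns the root with Im m > 0.
    """
    root = cmath.sqrt(z * z - 4.0)
    m = (-z + root) / 2.0
    if m.imag <= 0.0:
        # The two roots are m and 1/m; exactly one lies in the upper half plane.
        m = (-z - root) / 2.0
    return m


def regularized_density(e: float, eta: float) -> float:
    """(1/pi) Im m_sc(e + i eta): the semicircle density smoothed on scale eta."""
    return m_sc(complex(e, eta)).imag / math.pi
```

As $\eta \to 0$ the regularized density at $E = 0$ approaches $\varrho_{sc}(0) = 1/\pi$, which gives a quick sanity check of the branch choice.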

The second result concerns the delocalization of eigenvectors. The motivation comes from the Anderson
model. In the infinite volume, the extended states regime is usually characterized by the absolute continuity
of the spectrum; such characterization is meaningless for finite matrices. However, the lack of concentration
of the eigenfunctions for the finite volume approximations of the Anderson Hamiltonian is already a signature
of the extended states regime.

If $v$ is an $\ell^2$-normalized eigenvector of $H$, then the size of the $\ell^p$-norm of $v$, for $p > 2$, gives
information about delocalization. Complete delocalization occurs when $\|v\|_p \lesssim N^{-1/2+1/p}$ (note that
$\|v\|_p \ge C N^{-1/2+1/p} \|v\|_2$). The following result shows that eigenvectors are fully delocalized with very high
probability.

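Theorem 2.2 [17, Corollary 3.2] Under the conditions of Theorem 2.1, for any $|E| < 2$, fixed $K$ and
$2 < p < \infty$ we have

$$ \mathbb{P}\left\{ \exists\, v \;:\; Hv = \lambda v,\ |\lambda - E| \le \frac{K}{N},\ \|v\|_2 = 1,\ \|v\|_p \ge M N^{-\frac{1}{2} + \frac{1}{p}} \right\} \le C e^{-c\sqrt{M}} $$

for $M$ and $N$ large enough.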

The proof is an easy consequence of Theorem 2.1 and will be omitted here. 
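The role of the $\ell^p$-norm as a delocalization measure can be seen on two extreme examples: a perfectly flat unit vector and a unit vector concentrated on a single site. The sketch below (function names are hypothetical) computes the ratio $\|v\|_p / N^{-1/2+1/p}$, which is of order one for delocalized vectors and grows with $N$ for localized ones.

```python
def lp_norm(v, p):
    """The l^p norm of a vector given as a list of real numbers."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def delocalization_ratio(v, p):
    """||v||_p divided by N^{-1/2 + 1/p}; order one means complete delocalization."""
    n = len(v)
    return lp_norm(v, p) / n ** (-0.5 + 1.0 / p)


def flat_vector(n):
    """Unit vector with all entries equal to N^{-1/2}."""
    return [n ** -0.5] * n


def localized_vector(n):
    """Unit vector supported on a single coordinate."""
    return [1.0] + [0.0] * (n - 1)
```

For the flat vector the ratio equals one for every $p$, while for the single-site vector it equals $N^{1/2 - 1/p}$.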

The local semicircle law asserts that the empirical density on scales $\eta \gg O(1/N)$ is close to the semicircle
density. On even smaller scales $\eta \le O(1/N)$, the empirical density fluctuates, but its average, $\mathbb{E}\, \varrho_\eta(E)$,
remains bounded uniformly in $\eta$. This is a type of Wegner estimate that plays a central role in the localization
theory of random Schrödinger operators. In particular, it says that the probability of finding at least one
eigenvalue in an interval $I$ of size $\eta = \varepsilon/N$ is bounded by $C\varepsilon$ uniformly in $N$ and $\varepsilon \le 1$, i.e. no eigenvalue can
stick to any value. Furthermore, if the eigenvalues were independent (Poisson process), then the probability
of finding $n = 1, 2, 3, \dots$ eigenvalues in $I$ would be proportional to $\varepsilon^n$. For random matrices in the bulk of
the spectrum this probability is much smaller. This phenomenon is known as level repulsion and the precise
statement is the following:

Theorem 2.3 [17, Theorem 3.4 and 3.5] Suppose (2.3) holds and the measure $\nu$ is absolutely continuous
with a strictly positive and smooth density. Let $|E| < 2$ and $I = [E - \eta/2, E + \eta/2]$ with $\eta = \varepsilon/N$. Then for
any fixed $n$,

$$ \mathbb{P}(\mathcal{N}_I \ge n) \le \begin{cases} C \varepsilon^{n^2} & \text{[hermitian case]} \\ C \varepsilon^{n(n+1)/2} & \text{[symmetric case]} \end{cases} \tag{2.13} $$

uniformly in $\varepsilon \le 1$ and for all sufficiently large $N$.

The exponents are optimal as one can easily see from the Vandermonde determinant in the joint
probability density (1.2) for unitary ensembles. The sine kernel behavior (1.1) implies level repulsion (and even a
lower bound on $\mathbb{P}(\mathcal{N}_I \ge n)$), but usually not on arbitrarily small scales since the sine kernel is typically proven
only as a weak limit (see (3.2) later).
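The exponents in the level repulsion bound can be read off by a heuristic power counting from a density of the Vandermonde form $\prod_{i<j} |\lambda_i - \lambda_j|^\beta$: placing $n$ eigenvalues in a window of size $\varepsilon/N$ costs a volume factor $\varepsilon^n$, and each of the $n(n-1)/2$ pairs contributes a further factor $\varepsilon^\beta$. The short sketch below (a hypothetical helper, intended only as a check of this arithmetic) reproduces the two exponents.

```python
def repulsion_exponent(n: int, beta: int) -> int:
    """Power of epsilon from heuristic counting: n (volume) + beta * n(n-1)/2 (pairs)."""
    # n * (n - 1) is always even, so the integer division is exact.
    return n + beta * (n * (n - 1) // 2)
```

For $\beta = 2$ this gives $n^2$ and for $\beta = 1$ it gives $n(n+1)/2$, whereas independent (Poisson) points would give only $n$.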
Sketch of the proof. The starting point is formula (2.7) together with

$$ \mathcal{N}_I \le C N \eta\, \mathrm{Im}\, m(E + i\eta). $$

This implies

$$ \mathcal{N}_I \le C \eta \sum_{k=1}^{N} \frac{1}{(a_k^2 + b_k^2)^{1/2}} \tag{2.14} $$

with

$$ a_k := \eta + \frac{1}{N} \sum_{\alpha=1}^{N-1} \frac{\eta\, \xi_\alpha^{(k)}}{(\lambda_\alpha^{(k)} - E)^2 + \eta^2}, \qquad b_k := h_{kk} - E - \frac{1}{N} \sum_{\alpha=1}^{N-1} \frac{(\lambda_\alpha^{(k)} - E)\, \xi_\alpha^{(k)}}{(\lambda_\alpha^{(k)} - E)^2 + \eta^2}, $$

where $a_k$ and $b_k$ are, up to sign, the imaginary and real part, respectively, of the reciprocal of the summands in (2.7)
and $\xi_\alpha^{(k)}$ was defined in (2.8). The proof of Theorem 2.1 relied only on the imaginary part, i.e. $b_k$ in (2.14)
was neglected. In the proof of Theorem 2.3, however, we make essential use of $b_k$ as well. Since typically
$1/N \lesssim |\lambda_\alpha^{(k)} - E|$, we note that $a_k^2$ is much smaller than $b_k^2$ if $\eta \ll 1/N$ and this is the relevant regime for
the Wegner estimate and for the level repulsion.

Assuming a certain smoothness condition on the distribution $d\nu$, the distribution of the variables $\xi_\alpha^{(k)}$
will also be smooth even if we fix an index $k$ and we condition on the minor $B^{(k)}$, i.e. if we fix the eigenvalues
$\lambda_\alpha^{(k)}$ and the eigenvectors $u_\alpha^{(k)}$. Although the random variables $\xi_\alpha^{(k)} = N |a^{(k)} \cdot u_\alpha^{(k)}|^2$ are not independent for
different $\alpha$'s, they are sufficiently decorrelated so that the distribution of $b_k$ inherits some smoothness from
$a^{(k)}$. Sufficient smoothness on the distribution of $b_k$ makes the expectation value of $(a_k^2 + b_k^2)^{-p/2}$ finite for any
$p > 0$. This will give a bound on the $p$-th moment of $\mathcal{N}_I$ which will imply (2.13).

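We present this idea for hermitian matrices and for the simplest case $n = 1$. From (2.14) we have

$$ \mathbb{P}(\mathcal{N}_I \ge 1) \le \mathbb{E}\, \mathcal{N}_I^2 \le C (N\eta)^2\, \mathbb{E}\, \frac{1}{(a_1^{(k)})^2 + (b_1^{(k)})^2}\Big|_{k=1} = C (N\eta)^2\, \mathbb{E}\, \frac{1}{a_1^2 + b_1^2}. $$

Dropping the superscript $k = 1$ and introducing the notation

$$ d_\alpha = \frac{N(\lambda_\alpha - E)}{N^2 (\lambda_\alpha - E)^2 + \varepsilon^2}, \qquad c_\alpha = \frac{\varepsilon}{N^2 (\lambda_\alpha - E)^2 + \varepsilon^2}, $$

we have

$$ \mathbb{P}(\mathcal{N}_I \ge 1) \le C \varepsilon^2\, \mathbb{E} \left[ \Big( \sum_{\alpha=1}^{N-1} c_\alpha \xi_\alpha \Big)^2 + \Big( h - E - \sum_{\alpha=1}^{N-1} d_\alpha \xi_\alpha \Big)^2 \right]^{-1}. \tag{2.15} $$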

From the local semicircle law we know that with very high probability, there are several eigenvalues $\lambda_\alpha$
within a distance of $O(1/N)$ of $E$. Choosing four such eigenvalues, we can guarantee that for some index $\gamma$

$$ c_\gamma,\ c_{\gamma+1} \ge C\varepsilon, \qquad d_{\gamma+2},\ d_{\gamma+3} \ge C \tag{2.16} $$

for some positive constant $C$. If the $\xi_\alpha$'s were indeed independent and distributed according to the square of
a complex random variable $z_\alpha$ with a smooth and decaying density $d\mu(z)$ on the complex plane, then the
expectation in (2.15) would be bounded by

$$ \sup_{E} \int \frac{1}{\big( c_\gamma |z_\gamma|^2 + c_{\gamma+1} |z_{\gamma+1}|^2 \big)^2 + \big( E - d_{\gamma+2} |z_{\gamma+2}|^2 - d_{\gamma+3} |z_{\gamma+3}|^2 \big)^2} \prod_{j=0}^{3} d\mu(z_{\gamma+j}). \tag{2.17} $$
Simple calculation shows that this integral is bounded by $C\varepsilon^{-1}$ assuming the lower bounds (2.16). Combining
this bound with (2.15), we obtain (2.13) for $n = 1$. The proof for the general $n$ goes by induction. The
difference between the hermitian and the symmetric cases manifests itself in the fact that the $\xi_\alpha$'s are squares of
complex or real variables, respectively. This gives different estimates for integrals of the type (2.17), resulting
in different exponents in (2.13). $\Box$



3 Sine kernel universality 

Let $f(\lambda_1, \lambda_2, \dots, \lambda_N)$ denote the symmetric joint density function of the eigenvalues of the $N \times N$ Wigner
matrix $H$. For any $k \ge 1$ we define the $k$-point correlation functions (marginals) by

$$ p_N^{(k)}(\lambda_1, \dots, \lambda_k) = \int_{\mathbb{R}^{N-k}} f(\lambda_1, \lambda_2, \dots, \lambda_N)\, d\lambda_{k+1} \dots d\lambda_N. $$

We will use the notation $p_{N,GUE}^{(k)}$ and $p_{N,GOE}^{(k)}$ for the correlation functions of the GUE and GOE ensembles.

We consider the rescaled correlation functions about a fixed energy $E$ under a scaling that guarantees
that the local density is one. The sine-kernel universality for the GUE ensemble states that the rescaled
correlation functions converge weakly to the determinant of the sine-kernel, $K(x) = \frac{\sin \pi x}{\pi x}$, i.e.

$$ \frac{1}{[\varrho_{sc}(E)]^k}\, p_{N,GUE}^{(k)}\Big( E + \frac{a_1}{N \varrho_{sc}(E)}, \dots, E + \frac{a_k}{N \varrho_{sc}(E)} \Big) \to \det\big( K(a_i - a_j) \big)_{i,j=1}^{k} \tag{3.1} $$

as $N \to \infty$ for any fixed energy $|E| < 2$ in the bulk of the spectrum [27, 13]. A similar result holds for the
GOE case; the sine kernel being replaced with a similar but somewhat more complicated universal function,
see [26]. Our main result is that universality (3.1) holds for general hermitian or symmetric Wigner matrices
after averaging in the energy $E$:

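Theorem 3.1 [18] Let $H$ be an $N \times N$ symmetric or hermitian Wigner matrix with normalization defined
at the beginning of Section 2. Suppose that the distribution $\nu$ of the matrix elements has subexponential decay
(2.4). Let $k \ge 1$ and $O : \mathbb{R}^k \to \mathbb{R}$ be a continuous, compactly supported function. Then for any $|E| < 2$, we
have

$$ \lim_{\delta \to 0} \lim_{N \to \infty} \frac{1}{2\delta} \int_{E-\delta}^{E+\delta} dv \int_{\mathbb{R}^k} da_1 \dots da_k\, O(a_1, \dots, a_k)\, \frac{1}{[\varrho_{sc}(v)]^k} \Big( p_N^{(k)} - p_{N,\#}^{(k)} \Big) \Big( v + \frac{a_1}{N \varrho_{sc}(v)}, \dots, v + \frac{a_k}{N \varrho_{sc}(v)} \Big) = 0, \tag{3.2} $$

where $\#$ stands for GOE or GUE for the symmetric or hermitian cases, respectively.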

For the hermitian case, the first result on universality beyond the GUE was due to Johansson [24] (based
upon [6]) under the condition that $\nu$ has a Gaussian component with a positive variance independent of $N$.
His method was extended in [4] to Wishart matrices. The variance of the necessary Gaussian component was
reduced to $N^{-3/4+\varepsilon}$ in [19] under the additional technical assumptions that the measure $\nu$ is smooth and it
satisfies the logarithmic Sobolev inequality. The local statistics were identified via orthogonal polynomials.
The Gaussian component assumption was first removed completely in [20] under the condition that the
density of the probability measure $\nu$ is positive and it possesses a certain number of derivatives. Shortly
after [20] appeared on the arXiv, the same result, obtained by a different method, was posted [35] without
any regularity condition on $\nu$ provided that the third moment vanishes and $\nu$ is supported on at least three
points. Combining the two methods, all conditions on $\nu$ apart from the subexponential decay (2.4) were
removed in a short joint paper [21].

The methods of [20] and [35] both rely on the explicit formula of Brezin and Hikami [6], exploited also
in [24], for the correlation functions of the Wigner matrix with Gaussian convolution. This formula reduces
the problem to a saddle point analysis. The saddle points are identified by solving an equation involving the
Stieltjes transform $m_N(z)$ (2.6) with $\eta = \mathrm{Im}\, z$ corresponding to the variance of the Gaussian component:
precise information on $m_N(z)$ for a smaller $\eta$ implies that a smaller Gaussian component is sufficient.

In our work [20] we used the convergence of $m_N(z)$ to $m_{sc}(z)$ for very small $\eta = N^{-1+\varepsilon}$ established along
the proof of Theorem 2.1. To remove this tiny Gaussian component, we have compared the local eigenvalue
statistics of a given Wigner matrix $H$ with that of $H_s + sV$ for which the saddle point analysis applies. Here
$s^2 \sim \eta = N^{-1+\varepsilon}$ and the new Wigner matrix $H_s$ was chosen such that the law of $H_s + sV$ is very close
to that of $H$. Since Gaussian convolution corresponds to running a heat flow on the matrix elements, $H_s$ could, in
principle, be obtained by running the reverse heat flow on the elements of $H$. Although the reverse heat flow
is undefined for most initial conditions, one can construct an approximation to the reverse heat flow that is
well defined and yields $H_s$ with the required precision assuming sufficient smoothness on $\nu$. Technically, we
use the Ornstein-Uhlenbeck process instead of the heat flow to keep the variance constant. We also mention
that the result of [20] is valid for any fixed energy $E$, i.e. the $dv$ averaging in (3.2) is not necessary.

Tao and Vu [35] have directly compared local statistics of the Wigner matrix H and that of the matrix with 
order one Gaussian component for which Johansson has already proved universality. Their main technical 
result [35, Theorem 15] states that the local eigenvalue statistics of two Wigner matrices coincide as long 
as the first four moments of their single site distributions match. It is then an elementary lemma from 
probability theory ([35, Corollary 23] based upon [8]) to match to order four a given random variable with
another random variable with a Gaussian component. 

The proof of Theorem 3.1 for the symmetric case requires a new idea since the formula of Brezin and 
Hikami is not available. While the four moment theorem of [35] also applies to this case, there is no reference 
ensemble available. In the next sections we describe our new approach that proves universality for both 
hermitian and symmetric matrices without relying on any explicit formulae. 



4 Dyson Brownian motion 

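The joint distribution of the eigenvalues $\mathbf{x} = (x_1, x_2, \dots, x_N)$ of the Gaussian ensembles is given by the
following measure

$$ \mu = \mu_N(d\mathbf{x}) = \frac{e^{-\mathcal{H}(\mathbf{x})}}{Z_\beta}\, d\mathbf{x}, \qquad \mathcal{H}(\mathbf{x}) = N \left[ \beta \sum_{i=1}^{N} \frac{x_i^2}{4} - \frac{\beta}{N} \sum_{i<j} \log |x_j - x_i| \right], \tag{4.1} $$

where $\beta = 1$ for GOE and $\beta = 2$ for GUE. For definiteness, we consider the $\beta = 1$ GOE case and we assume
that the eigenvalues are ordered, i.e. $\mu$ is restricted to $\Sigma_N = \{ \mathbf{x} \in \mathbb{R}^N : x_1 < x_2 < \dots < x_N \}$.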


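Suppose the matrix elements evolve according to the Ornstein-Uhlenbeck process on $\mathbb{R}$, i.e. the density
of their distribution $\nu_t = u_t(x)\,dx$ satisfies

$$ \partial_t u_t = A u_t, \qquad A = \frac{1}{2} \frac{\partial^2}{\partial x^2} - \frac{x}{2} \frac{\partial}{\partial x}. \tag{4.2} $$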
The Ornstein-Uhlenbeck process (4.2) induces a stochastic process, the Dyson Brownian motion, on the
eigenvalues with a generator given by

$$ L = \sum_{i=1}^{N} \frac{1}{2N} \partial_i^2 + \sum_{i=1}^{N} \left( -\frac{\beta}{4} x_i + \frac{\beta}{2N} \sum_{j \ne i} \frac{1}{x_i - x_j} \right) \partial_i \tag{4.3} $$

acting on $L^2(\mu)$. The measure $\mu$ is invariant and reversible with respect to the dynamics generated by $L$.
Let

$$ D(f) = -\int f L f\, d\mu = \sum_{j=1}^{N} \frac{1}{2N} \int (\partial_j f)^2\, d\mu \tag{4.4} $$

be the corresponding Dirichlet form. Denote the distribution of the eigenvalues at time $t$ by $f_t(\mathbf{x})\mu(d\mathbf{x})$.
Then $f_t$ satisfies

$$ \partial_t f_t = L f_t \tag{4.5} $$

with initial condition $f_0$ given by the eigenvalue density of the Wigner ensemble. Dyson Brownian motion
is the corresponding system of stochastic differential equations for the eigenvalues $\mathbf{x}(t)$ that is given by (see,
e.g. Section 12.1 of [23])

$$ dx_i = \frac{dB_i}{\sqrt{N}} + \left[ -\frac{\beta}{4} x_i + \frac{\beta}{2N} \sum_{j \ne i} \frac{1}{x_i - x_j} \right] dt, \qquad 1 \le i \le N, \tag{4.6} $$

where $\{B_i : 1 \le i \le N\}$ is a collection of independent Brownian motions. Note that the equations (4.5)
and (4.6) are defined for any $\beta \ge 1$, independently of the original matrix models. Our main technical result
(Theorem 5.1) holds for general $\beta \ge 1$.
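To illustrate the dynamics, the following is a rough Euler-Maruyama discretization of the eigenvalue equations, read here as $dx_i = N^{-1/2} dB_i + \big[-\tfrac{\beta}{4} x_i + \tfrac{\beta}{2N} \sum_{j \ne i} (x_i - x_j)^{-1}\big] dt$. This is only a sketch under stated assumptions: the function names, step size and re-sorting step are choices made for this example and are not part of the analysis; the re-sorting is a crude way to keep the discretized particles ordered, since a finite time step can make neighbors cross even though the continuous dynamics does not.

```python
import math
import random


def dbm_step(x, beta, dt, rng):
    """One Euler-Maruyama step for the Dyson Brownian motion with OU drift."""
    n = len(x)
    new = []
    for i in range(n):
        # Singular pair repulsion from all other particles.
        repulsion = sum(1.0 / (x[i] - x[j]) for j in range(n) if j != i)
        drift = -beta * x[i] / 4.0 + beta / (2.0 * n) * repulsion
        # Brownian increment with variance dt / N.
        noise = rng.gauss(0.0, math.sqrt(dt / n))
        new.append(x[i] + drift * dt + noise)
    new.sort()  # keep the configuration in the ordered set (discretization fix)
    return new


def simulate_dbm(x0, beta=2.0, dt=1e-4, steps=200, seed=0):
    """Run the discretized flow from the initial configuration x0."""
    rng = random.Random(seed)
    x = sorted(x0)
    for _ in range(steps):
        x = dbm_step(x, beta, dt, rng)
    return x
```

The time step must be small compared with the square of the smallest gap for the scheme to be meaningful; more careful schemes (implicit or adaptive) would be needed for quantitative experiments.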



5 Local Relaxation Flow 

The Hamiltonian $\mathcal{H}$ of the invariant measure $\mu$ of the Dyson Brownian motion is convex, with Hessian bounded
from below,

$$ \mathrm{Hess}\, \mathcal{H} \ge \frac{\beta N}{2}, $$

on the set $\Sigma_N$. By the Bakry-Emery criterion, this guarantees that $\mu$ satisfies the logarithmic Sobolev
inequality and the relaxation time to equilibrium is of order one (note the additional $1/N$ factor in the
Dirichlet form (4.4) that rescales time).

We now introduce the local relaxation measure, which has the local statistics of GOE (or GUE) but
generates a faster decaying dynamics. Let $\gamma_j$ be the semicircle location of the $j$-th eigenvalue, i.e.

$$ \gamma_j = n_{sc}^{-1}(j/N), \qquad n_{sc}(E) := \int_{-\infty}^{E} \varrho_{sc}(x)\, dx. $$
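The semicircle locations $\gamma_j$ have no simple closed form, but they are easy to compute numerically by inverting the distribution function $n_{sc}$. The sketch below uses plain bisection; the function names and the tolerance are assumptions made for this illustration.

```python
import math


def n_sc(e):
    """Distribution function of the semicircle law on [-2, 2]."""
    if e <= -2.0:
        return 0.0
    if e >= 2.0:
        return 1.0
    return (0.5
            + e * math.sqrt(4.0 - e * e) / (4.0 * math.pi)
            + math.asin(e / 2.0) / math.pi)


def classical_location(j, n, tol=1e-12):
    """Solve n_sc(gamma) = j / n for gamma in [-2, 2] by bisection."""
    target = j / n
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if n_sc(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In the bulk consecutive locations are spaced by roughly $1/(N \varrho_{sc}(\gamma_j))$, while near the edges the spacing grows to order $N^{-2/3}$ because the density vanishes like a square root.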

We fix a regularization parameter $\eta \ll 1$ and we replace the interaction potential between $x_j$ and far away
particles by a regularized mean field potential

$$ W_j(x) = -\frac{1}{N} \sum_{k :\, |k - j| > N\eta} \log\big( |x - \gamma_k| + \eta \big). \tag{5.1} $$

Strictly speaking, $W_j(x)$ is defined by this formula only in an interval of size $N\eta$ about $\gamma_j$ and we use a
quadratic extension beyond, but we leave this technicality aside.

The local relaxation measure $\omega_N = \omega$ is a Gibbs measure defined by the Hamiltonian

H = iV^te + VK,(.T,)|+/3^Iog|x,-x,|-|^ E ^ogi\x.-x,\+rj). 

j=l I ) i<3 i ]:\j-i\>Nri 

We often write $\omega = \psi\mu$ where $\psi$ is the Radon-Nikodym derivative. The local relaxation flow is defined to be the reversible dynamics w.r.t. $\omega$ characterized by the generator $\tilde L$ defined by

\[
\int f \tilde L g \, d\omega = -\frac{1}{2N} \sum_j \int \partial_j f \, \partial_j g \, d\omega. \tag{5.2}
\]

Explicitly, $\tilde L$ is given by

j k,\k~j\>N7i I ^ ''I ' ' 

A simple calculation shows that the mean field potential is uniformly convex with

\[
\inf_j \inf_x W_j''(x) \ge c\,\eta^{-1/3}. \tag{5.4}
\]

This will guarantee that the relaxation time to equilibrium $\omega$ for the $\tilde L$ dynamics is of order $\eta^{1/3}$.
We recall the definition of the relative entropy of $f$ with respect to any probability measure $d\lambda$:

\[
S_\lambda(f) = \int f \log f \, d\lambda, \qquad S_\lambda(f|\psi) = \int f \log(f/\psi) \, d\lambda.
\]
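
A short discrete example may help to fix the notation. The sketch below evaluates both entropies for a measure on three points and checks the change-of-reference identity $S_\mu(f|\psi) = S_\omega(f/\psi)$ with $\omega = \psi\mu$. The weights, the densities and the function names are hypothetical and chosen only for illustration.

```python
import math


def entropy(f, weights):
    """S_lambda(f) = sum_k f_k log(f_k) lambda_k for a discrete measure lambda."""
    return sum(fk * math.log(fk) * w for fk, w in zip(f, weights) if fk > 0)


def relative_entropy(f, psi, weights):
    """S_lambda(f | psi) = sum_k f_k log(f_k / psi_k) lambda_k."""
    return sum(
        fk * math.log(fk / pk) * w for fk, pk, w in zip(f, psi, weights) if fk > 0
    )


# Hypothetical data: mu is a probability vector, f and psi are densities w.r.t. mu.
mu = [0.2, 0.3, 0.5]
psi = [1.5, 1.0, 0.8]  # sum(psi_k * mu_k) = 1
f = [0.5, 1.0, 1.2]    # sum(f_k * mu_k) = 1

omega = [pk * mk for pk, mk in zip(psi, mu)]  # omega = psi * mu
g = [fk / pk for fk, pk in zip(f, psi)]       # density of f * mu w.r.t. omega

lhs = relative_entropy(f, psi, mu)
rhs = entropy(g, omega)
```

Here `lhs` and `rhs` agree up to rounding, and the relative entropy vanishes when `f` equals `psi`.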

Our main technical result is the following theorem that states that the relaxation time $\tau$ for specific local observables is much shorter than order one.

Theorem 5.1 (Universality of Dyson Brownian Motion for Short Time) Suppose that $S_\mu(f_0|\psi) \le CN^m$ for some $m$ fixed. Let $\tau = \eta^{1/3}N^{\varepsilon}$ with some $\varepsilon > 0$ and assume that $\eta \ge N^{-3/55+\varepsilon}$. Assume that there is a positive number $\Lambda$ such that

snp f by tdfi<CT]-^ A. (5.5) 

0<t<T jj 

Let $G$ be a bounded smooth function with compact support. Then for any fixed $n \ge 1$ and $J \subset \{1, \dots, N\}$ we have



< 



CA 



]\fl-e^5/3 



We emphasize that Theorem 5.1 applies to all $\beta \ge 1$ ensembles and the only assumption concerning the distribution $f_t$ is in (5.5). In the case of the original Wigner ensembles, $\beta = 1, 2$, the critical constant $\Lambda$ can be estimated under an additional assumption.
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Lemma 5.2 Let fo be the joint density of the eigenvalues of a Wigner matrix. Suppose that the measure dv 
of its single site distribution satisfies the logarithmic Sobolev inequality. Then the constant $\Lambda$ in (5.5) can be estimated as

A < a?r'^*/^+" (5.6) 

for any $\sigma > 0$.

For the proof of this lemma, we can estimate $b_j$ as

as long as $x_k$ is sufficiently near $\gamma_k$ so that $\mathrm{sgn}(x_j - \gamma_k) = \mathrm{sgn}(\gamma_j - \gamma_k)$ holds for $|j - k| > N\eta$. The average difference between $x_k$ and $\mathbb{E}\,x_k$ can be estimated using the logarithmic Sobolev inequality for $\nu$. The average of $|\mathbb{E}\,x_k - \gamma_k|$ is estimated in Proposition 4.2 of [17], which was a consequence of the local semicircle law. Combining these results with information on the lowest and largest eigenvalues [37], we can show that
TT Ek -lk\< A^-3/5+^ and this yields (5.6). □ 

Combining Lemma 5.2 with Theorem 5.1 and choosing rj appropriately, we see that the local eigenvalue 
statistics of fr with r > 7V~^/55+^ coincides with that of the global equilibrium measure, i.e. with GOE or 
GUE. For hermitian matrices, the same statement was already proven in [19] even for r > N^^^^ by using 
the Brezin-Hikami formula, but the current approach is purely analytical and it applies to symmetric matrices as well. Using the reverse heat flow argument, we can show that the local statistics of $f_0$ are also given by GOE or GUE, assuming that the initial distribution $\nu$ is sufficiently smooth. The smoothness condition and the additional requirement that $\nu$ satisfies the logarithmic Sobolev inequality can be removed by applying the four moment theorem of [35].



6 Proof of Theorem 5.1 

We first list the key new ideas behind the proof of Theorem 5.1, then we formulate the corresponding results.

I. The key concept is the introduction of the local relaxation flow (5.2) which has the following two properties: (1) The invariant measure for this flow, the local relaxation measure $\omega$, has the same local eigenvalue statistics as the GOE or GUE. (2) The relaxation time of the local relaxation flow is much shorter than that of the DBM, which is of order one.

II. Suppose we have a density $q$ w.r.t. $\omega$ that evolves with the local relaxation flow. Then, by differentiating the Dirichlet form w.r.t. $\omega$, we will prove that the difference between the local statistics of $q\omega$ and $\omega$ can be estimated in terms of the Dirichlet form of $q$ w.r.t. $\omega$. Hence if the Dirichlet form is small, the local statistics of $q\omega$ are independent of $q$.

III. It remains to show that the Dirichlet form of $g = f_t/\psi$ w.r.t. $\omega$ is small for $t$ sufficiently large (but still much less than order one). To do that, we study the evolution of the entropy of $f_t\mu$ relative to $\omega$. This provides estimates on the entropy and Dirichlet form which serve as inputs for Step II to conclude the universality.
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The first ingredient to prove Theorem 5.1 is the analysis of the local relaxation flow which satisfies the 
logarithmic Sobolev inequality and the following dissipation estimate. 

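Theorem 6.1 Suppose (5.4) holds. Consider the equation

\[
\partial_t q_t = \tilde L q_t \tag{6.1}
\]

with reversible measure $\omega$. Then we have the following estimates

\[
\partial_t D_\omega(\sqrt{q_t}) \le -C\eta^{-1/3} D_\omega(\sqrt{q_t}) - \frac{1}{2N^2} \int \sum_{|i-j| \le N\eta} \frac{1}{(x_i - x_j)^2} \big( \partial_i \sqrt{q_t} - \partial_j \sqrt{q_t} \big)^2 \, d\omega, \tag{6.2}
\]

\[
\frac{1}{2N^2} \int_0^\infty ds \int \sum_{|i-j| \le N\eta} \frac{1}{(x_i - x_j)^2} \big( \partial_i \sqrt{q_s} - \partial_j \sqrt{q_s} \big)^2 \, d\omega \le D_\omega(\sqrt{q_0}) \tag{6.3}
\]

and the logarithmic Sobolev inequality

\[
S_\omega(q) \le C\eta^{1/3} D_\omega(\sqrt{q}) \tag{6.4}
\]

with a universal constant $C$. Thus the time to equilibrium is of order $\eta^{1/3}$:

\[
S_\omega(q_t) \le e^{-Ct\eta^{-1/3}} S_\omega(q_0). \tag{6.5}
\]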

The proof follows the standard argument in [3] (used in this context in [19]). The key input is the following lower bound on the Hessian of $\tilde{\mathcal H}$

5i.(v,(V=«)v)>C,-./=l||v||^ + ^ E («) 

\i-j\<Nri ^ ' ^' 

The first term is due to convexity of the mean field potential (5.4). The second term comes from the 
additional convexity of the local interaction and it corresponds to "local Dirichlet form dissipation" . The 
estimate (6.3) on this additional term plays a key role in the next theorem. 

Theorem 6.2 Suppose that the density $q_0$ satisfies $S_\omega(q_0) \le CN^m$ with some $m > 0$ fixed. Let $G$ be a bounded smooth function with compact support and let $J \subset \{1, 2, \dots, N\}$. Set $\tau = \eta^{1/3}N^{\varepsilon}$. Then for any $n \ge 1$ fixed we have



ieJ ieJ 



(6.7) 



<C\l +Ce 



Sketch of the proof. Let $q_t$ satisfy

\[
\partial_t q_t = \tilde L q_t
\]

with an initial condition $q_0$. Thanks to the exponential decay of the entropy on time scale $\tau \gg \eta^{1/3}$, see (6.5), the difference between the local statistics w.r.t. $q_\tau\omega$ and $q_\infty\omega = \omega$ is subexponentially small in $N$. To compare $q_0$ with $q_\tau$, by differentiation, we have



y" E ^i^i^'' ~ Xt+n))qTduj - y" E ~ Xi+n))qod'^ 
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J ds J — G'{N{xi - Xi+n))[diqs - 9i+Kgs]da 



From the Schwarz inequality and $\partial q = 2\sqrt{q}\,\partial\sqrt{q}$ the last term is bounded by

- 1/2 



ds 



1 



-, 1/2 



where we have used (6.3) and $G'(N(x_i - x_{i+n}))^2 (x_i - x_{i+n})^2 \le C/N^2$.



< C 



N 



(6.8) 



□ 



Notice that if we use only the entropy dissipation and the Dirichlet form, the main term on the right hand side of (6.7) will become $\sqrt{S\tau}$. Hence by exploiting the local Dirichlet form dissipation coming from the second term on the r.h.s. of (6.2), we gain the crucial factor $N^{-1/2}$ in the estimate.

The final ingredient to prove Theorem 5.1 is the following entropy and Dirichlet form estimates. 

Theorem 6.3 Suppose the assumptions of Theorem 5.1 hold. Let $\tau = \eta^{1/3}N^{\varepsilon}$ and let $g_t = f_t/\psi$ so that $S_\mu(f_t|\psi) = S_\omega(g_t)$. Then the entropy and the Dirichlet form satisfy the estimates:



(6.9) 



Sketch of the proof. Recall that $\partial_t f_t = L f_t$. The standard estimate on the entropy of $f_t$ with respect to the invariant measure is obtained by differentiating it twice and using the logarithmic Sobolev inequality. The entropy and the Dirichlet form in (6.9) are, however, computed with respect to the measure $\omega$. This yields the additional second term in the following identity [39] that holds for any probability density $\psi_t$:

\[
\partial_t S_\mu(f_t|\psi_t) = -\frac{2}{N} \sum_j \int (\partial_j \sqrt{g_t})^2 \, \psi_t \, d\mu + \int g_t (L - \partial_t) \psi_t \, d\mu,
\]

where $g_t = f_t/\psi_t$. In our application we set $\psi_t = \psi = \omega/\mu$, hence we have

\[
\partial_t S_\omega(g_t) = -\frac{2}{N} \sum_j \int (\partial_j \sqrt{g_t})^2 \, d\omega + \int \tilde L g_t \, d\omega + \sum_j \int b_j \, \partial_j g_t \, d\omega.
\]

Since $\omega$ is invariant, the middle term on the right hand side vanishes, and from the Schwarz inequality

\[
\partial_t S_\omega(g_t) \le -D_\omega(\sqrt{g_t}) + CN \sum_j \int b_j^2 \, g_t \, d\omega. \tag{6.10}
\]



Together with (6.4) and (5.5), we have 

dtSM < -Cir^^'sM + Cv'^^- 



.11) 



which, after integrating it from $t = 0$ to $\tau/2$, proves the first inequality in (6.9). The second inequality can be obtained by integrating (6.10) from $t = \tau/2$ to $t = \tau$ and using the monotonicity of the Dirichlet form in time. □
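
The integration step can be illustrated with a generic differential inequality of the form $\partial_t S \le -aS + b$, whose extremal solution satisfies $S(t) = e^{-at}S(0) + \frac{b}{a}(1 - e^{-at})$. In the sketch below $a$ plays the role of the rate $C\eta^{-1/3}$ and $b$ the role of the error term; the numerical values of `a`, `b`, `s0` and the function names are hypothetical and are not taken from the estimates above.

```python
import math


def gronwall_bound(s0, a, b, t):
    """Closed-form solution of S' = -a S + b with S(0) = s0."""
    return math.exp(-a * t) * s0 + (b / a) * (1.0 - math.exp(-a * t))


def integrate_extremal(s0, a, b, t, steps=100000):
    """Explicit Euler integration of the extremal ODE S' = -a S + b."""
    dt = t / steps
    s = s0
    for _ in range(steps):
        s += dt * (-a * s + b)
    return s


# Hypothetical values: a large initial entropy, rate a = 10, error term b = 1.
s0, a, b, t = 1.0e6, 10.0, 1.0, 2.0
closed_form = gronwall_bound(s0, a, b, t)
numeric = integrate_extremal(s0, a, b, t)
```

Once $at$ is large, the contribution of the initial entropy is exponentially small and the bound is dominated by $b/a$. This is the mechanism by which a polynomially large initial entropy is removed on time scales slightly longer than the relaxation time.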
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Finally, we sketch the proof of Theorem 5.1. With the choice of t = rj^^^N'^ and qq = fr/^, Theorems 
6.1, 6.2 and 6.3 directly imply 



< 



CA 



(6.12) 



Ce 



i.e. the local statistics of $f_\tau\mu$ and $\omega$ are close. Clearly, equation (6.12) also holds for the special choice $f_0 = 1$ (for which $f_\tau = 1$), i.e. the local statistics of $\mu$ and $\omega$ can also be compared. This completes the proof of Theorem 5.1. □
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